ABSTRACT


Micro-blogging services such as Twitter let users share their emotions in a convenient format. The data
that individuals post on Twitter provide information about their real-time feelings. An important task is
to extract users' tweets and analyze them, because the extracted information helps to predict users' opinions
toward specific topics. The number of tweets available on micro-blogging services is enormous, so it is very
difficult for a user to read them all; the major challenge is to analyze all tweets in a short time. In this
paper we mainly focus on solving this problem with the naive Bayes technique. The paper attempts to obtain
the polarity of individuals' opinions for use in opinion and sentiment analysis.
Introduction:

Electronic learning has become faster and much more convenient
thanks to the worldwide availability of the Internet. The number of
customer reviews of various products keeps increasing.
This large number of reviews helps manufacturers
and organizations to improve their quality as well as their response. It is
a complex problem, however, for users to read all reviews in order to make a better
purchasing decision. It helps that customer reviews of
popular products are available from various product review sites.
There has been a lot of research activity in the areas of opinion mining
and sentiment analysis. Researchers are very much interested
in performing opinion mining and sentiment analysis because of
the increased availability of machine learning techniques used in
natural language processing and information retrieval, and because of the
growth of online customer review aggregation.

Problem  definition: 

With popular micro-blogging services like Twitter, users are able
to share their real-time feelings online in a more convenient way.
The user-generated data on Twitter is thus regarded as a resource
providing individuals' spontaneous emotional information, and
has attracted much attention from researchers. Prior work has
measured the sentiment expressed in users' tweets and then
performed various kinds of analysis and learning. In this paper, we mainly
focus on solving this problem with a matrix factorization framework
that incorporates social context and topical context.
The experimental results on a real-world Twitter data set show
that this framework outperforms the state-of-the-art collaborative
filtering methods, and demonstrate that both social context
and topical context are effective in improving user-topic
opinion prediction performance.

Literature  survey: 

Over the last decades, various review classification techniques
have become available for determining the polarity of reviews.
The idea was first presented in [1], where the polarity
of reviews was used to support better purchasing
decisions.

Existing  system: 

1. The performance of opinion mining in determining orientation,
or polarity, is evaluated by calculating metrics
such as precision, recall and F-measure. Rudy gives an overview of
the work done on opinion mining and orientation detection;
for movie review data, the SVM technique is reported to
provide an efficiency of 89%, and
Gang Li reports a performance of 78% using k-means clustering.

2. As far as the data source is concerned, a huge amount of work has been
done on movie and product reviews to determine opinion
orientation. The Internet Movie Database is used for movie
reviews, and product reviews are taken from Amazon.com.
Movie review mining is a more challenging application than many
other types of review mining.

3. For Amazon review data, Gam gam uses the maximum entropy
technique, which gives a precision of 72%, a recall of 78% and an
F-measure of 75%.

4. Movie reviews are challenging because factual information
is always mixed with real-life review data, and because mocking
(ironic) words are used in writing movie reviews. The product review
domain differs considerably from the movie review domain
for the following reasons.

5. One reason is that product reviews contain feature-specific
comments, because people may like some features and dislike
others. A review thus contains several opinion orientations in its text,
which makes it difficult to classify the orientation of the
review as positive or negative. Such feature-specific
comments occur less often in movie reviews.

6. The second reason is that product reviews contain a lot of comparative
sentences, in which people discuss other products.


Proposed  system: 

1. Compared with the content of other, more sophisticated social
media, the improvised short messages on micro-blogging services
are easier to obtain and more likely to reflect individuals'
spontaneous emotions. On Twitter, one of the most famous micro-blogging
services, hundreds of millions of people freely express every day how
they feel about breaking news, public figures, hot products,
or just daily things, in tweets (the messages posted by Twitter users)
limited to 140 characters. In general, subjective
feelings about particular matters can be defined as an
individual's opinions, which are considered to be the result of
emotion and play an important role in the decision-making
process most of the time.

2. With popular micro-blogging services like Twitter, users are
able to share their real-time feelings online in a more
convenient way.

Materials  and  Methods: 

External  Interface  Requirements: 

USER INTERFACE SYSTEMS CAN BE BROADLY CLASSIFIED AS:

1. User-Initiated Interfaces: the user is in charge, controlling the
progress of the user/computer dialogue. In a computer-initiated
interface, by contrast, the computer selects the next stage in the
interaction.

2. Computer-Initiated Interfaces: the computer is in charge,
controlling the progress of the user/computer dialogue.
Information is displayed, and in response to the user's input the computer takes
action or displays further information.

Hardware  Interfaces: 

1) 1 GB of RAM or higher.

2) Processor: Intel Pentium 4 (1.5 GHz) or above, or equivalent.

3) 40 GB of hard disk space or higher.

4) Internet connection.

Software  Interfaces: 

1)  Programming  Language:  Java 

2)  Tools:  JDK  1.6  or  above. 

3) Operating System: Linux

Methods: 

Naive  Bayes  Classifier 

It is a probabilistic, supervised classifier based on Bayes' theorem, which is due to Thomas
Bayes. According to this theorem, if there are two events, say e1
and e2, then the conditional probability of occurrence of event e1
when e2 has already occurred is given by the following mathematical
formula:

$$P(e_1 \mid e_2) = \frac{P(e_2 \mid e_1)\, P(e_1)}{P(e_2)}$$

This algorithm is used to calculate the probability that a piece of
data is positive or negative. The conditional probability of a
sentiment given a sentence is:

$$P(\text{Sentiment} \mid \text{Sentence}) = \frac{P(\text{Sentence} \mid \text{Sentiment})\, P(\text{Sentiment})}{P(\text{Sentence})}$$

And the conditional probability of a word given a sentiment class is:

$$P(\text{Word} \mid \text{Sentiment}) = \frac{\text{number of occurrences of the word in the class} + 1}{\text{number of words belonging to the class} + \text{total number of words}}$$
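
The formulas above can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: the smoothed word estimate is read as the probability of a word within a sentiment class, "total number of words" is taken to be the number of distinct words seen in training, the probability of a sentence given a sentiment is approximated as the product of its word probabilities (the naive independence assumption), and P(Sentence) is dropped because it is the same for every class. The class name, the method names and the whitespace tokenizer are hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Minimal naive Bayes sentiment sketch with add-one smoothing (illustrative only). */
final class NaiveBayesSketch {
    // Word counts per class, total word count per class, sentence count per class.
    private final Map<String, Map<String, Integer>> wordCounts = new HashMap<>();
    private final Map<String, Integer> totalWords = new HashMap<>();
    private final Map<String, Integer> docCounts = new HashMap<>();
    private final Set<String> vocabulary = new HashSet<>();
    private int totalDocs = 0;

    /** Hypothetical tokenizer: lower-case the text and split on whitespace. */
    static String[] tokenize(String sentence) {
        return sentence.toLowerCase().trim().split("\\s+");
    }

    /** Adds one labeled training sentence to the counts. */
    void train(String sentiment, String sentence) {
        docCounts.merge(sentiment, 1, Integer::sum);
        totalDocs++;
        Map<String, Integer> counts =
                wordCounts.computeIfAbsent(sentiment, k -> new HashMap<>());
        for (String word : tokenize(sentence)) {
            counts.merge(word, 1, Integer::sum);
            totalWords.merge(sentiment, 1, Integer::sum);
            vocabulary.add(word);
        }
    }

    /** (occurrences of word in class + 1) / (words in class + distinct words overall). */
    double wordProbability(String word, String sentiment) {
        int count = wordCounts
                .getOrDefault(sentiment, Collections.<String, Integer>emptyMap())
                .getOrDefault(word, 0);
        int classTotal = totalWords.getOrDefault(sentiment, 0);
        return (count + 1.0) / (classTotal + vocabulary.size());
    }

    /** log P(sentiment) plus the sum of log P(word | sentiment); P(sentence) is omitted. */
    double logScore(String sentiment, String sentence) {
        double score = Math.log(docCounts.get(sentiment) / (double) totalDocs);
        for (String word : tokenize(sentence)) {
            score += Math.log(wordProbability(word, sentiment));
        }
        return score;
    }

    /** Returns the trained label with the highest score, or null if nothing was trained. */
    String classify(String sentence) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String sentiment : docCounts.keySet()) {
            double score = logScore(sentiment, sentence);
            if (score > bestScore) {
                bestScore = score;
                best = sentiment;
            }
        }
        return best;
    }
}
```

Working with sums of logarithms rather than products of probabilities avoids numeric underflow on long sentences, and the added 1 in the numerator keeps a word that was never seen in a class from forcing the whole score to zero.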

Evaluation  of  Method: 

To evaluate the method, the following measures are used:

Accuracy,  Precision,  Recall,  Relevance. 

The following contingency table is used to calculate the various measures.

                Relevant               Irrelevant
Detected        True Positive (tp)     False Positive (fp)
Not detected    False Negative (fn)    True Negative (tn)

$$\text{Precision} = \frac{tp}{tp + fp}$$

$$\text{Accuracy} = \frac{tp + tn}{tp + tn + fp + fn}$$

$$\text{Recall} = \frac{tp}{tp + fn}$$
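
As a minimal sketch, the measures above can be computed from the four contingency-table counts as follows. The class and method names are hypothetical. The F-measure mentioned earlier in the paper is not defined here, so the balanced form (the harmonic mean of precision and recall) is assumed.

```java
/** Evaluation measures computed from contingency-table counts (illustrative only). */
final class EvaluationMeasures {
    private EvaluationMeasures() {}

    /** tp / (tp + fp); the result is NaN when nothing was detected. */
    static double precision(int tp, int fp) {
        return tp / (double) (tp + fp);
    }

    /** tp / (tp + fn); the result is NaN when there are no relevant items. */
    static double recall(int tp, int fn) {
        return tp / (double) (tp + fn);
    }

    /** (tp + tn) / (tp + tn + fp + fn). */
    static double accuracy(int tp, int tn, int fp, int fn) {
        return (tp + tn) / (double) (tp + tn + fp + fn);
    }

    /** Assumed balanced F-measure: 2PR / (P + R). */
    static double fMeasure(double precision, double recall) {
        return 2 * precision * recall / (precision + recall);
    }
}
```

Under this assumed definition, a precision of 0.72 and a recall of 0.78 give an F-measure of about 0.75, which agrees with the maximum entropy figures quoted in the existing-system overview.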

System  Architecture: 


1. User Interface: The user interface is mainly used to take input
from the user at run time, e.g. a product name.

2. Twitter API: Based on the user input, the information
(reviews/tweets) will be collected from social media (Twitter). To
collect information from Twitter, the Java API of Twitter will be
used, which basically collects all the information up to the current
instant. The Twitter API provides the functionality for getting
tweets from Twitter.

3. Parser: The data collected from the Twitter API will
be raw XML data, which is hard to analyze. A parser will therefore
convert the raw XML data into meaningful data, i.e. extract only the
meaningful parts from the whole data.

4. POS Tagger (Part-of-Speech Tagger): This is a Java library
made by Stanford University for tagging the words of a sentence with their parts of
speech. Opinion mining will focus on adjectives, adverbs and verbs,
so the POS tagger will be used
to filter out the unnecessary parts of speech.

5. API to form Input: This will be the bridge that converts the filtered
POS output into the input form of the classifier, which will be evaluated later
using the training data-set.
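
Steps 3 and 4 above can be illustrated with the following minimal sketch. It makes several assumptions that are not taken from this paper: the XML layout (tweet text inside `<text>` elements) is hypothetical and is not claimed to match what the Twitter API returns, and the tagged input is assumed to arrive as whitespace-separated `word_TAG` tokens with Penn Treebank tags, of the kind a POS tagger could produce. The tagger itself is not called here. The class and method names are also hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

/** Parsing and part-of-speech filtering sketch (illustrative only). */
final class TweetPreprocessing {
    private TweetPreprocessing() {}

    /** Extracts the content of every <text> element from a hypothetical XML payload. */
    static List<String> extractTweetTexts(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Secure processing limits what the parser will do with untrusted input.
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName("text");
            List<String> texts = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                texts.add(nodes.item(i).getTextContent().trim());
            }
            return texts;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML payload", e);
        }
    }

    /** Keeps adjectives (JJ*), adverbs (RB*) and verbs (VB*) from word_TAG tokens. */
    static List<String> keepOpinionWords(String tagged) {
        List<String> kept = new ArrayList<>();
        for (String token : tagged.trim().split("\\s+")) {
            int sep = token.lastIndexOf('_');
            if (sep <= 0) {
                continue; // skip tokens that carry no tag
            }
            String tag = token.substring(sep + 1);
            if (tag.startsWith("JJ") || tag.startsWith("RB") || tag.startsWith("VB")) {
                kept.add(token.substring(0, sep));
            }
        }
        return kept;
    }
}
```

For example, the tagged string `The_DT phone_NN is_VBZ really_RB good_JJ` would be reduced to the words `is`, `really` and `good`, which could then be passed on as classifier input.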

Training Data-set: Previously collected data will be used to
train the machine to differentiate the polarities of words, e.g.
Good: positive; Gd: positive; bd: negative; not well: negative;
:) : positive; :( : negative.

The input will then be processed based on the training data-set.
Predict Output: The input will be processed based on the trained
data set to predict the polarity of user reviews, e.g. positive
or negative, and the result will be presented in graphical form.
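
The output step can be sketched as follows, assuming the classifier has already produced one polarity label per tweet. The text bar chart is only a stand-in for the graphical presentation, whose actual form is not specified here, and the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Tallies predicted polarities and renders them as text bars (illustrative only). */
final class PolarityReport {
    private PolarityReport() {}

    /** Counts how many predicted labels fall into each polarity. */
    static Map<String, Integer> tally(List<String> predictedLabels) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("positive", 0);
        counts.put("negative", 0);
        for (String label : predictedLabels) {
            counts.merge(label, 1, Integer::sum);
        }
        return counts;
    }

    /** Renders one line per polarity: the label, one '#' per tweet, then the count. */
    static String renderBars(Map<String, Integer> counts) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            out.append(entry.getKey()).append(' ');
            for (int i = 0; i < entry.getValue(); i++) {
                out.append('#');
            }
            out.append(' ').append(entry.getValue()).append('\n');
        }
        return out.toString();
    }
}
```

A graphical front end could use the same counts to draw a bar or pie chart in place of the text bars.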

Discussion: 

In this paper, we focus on the challenging problem of predicting
users' opinions toward topics on which they have not yet directly expressed one, which
we define as user-topic opinion prediction. The main contributions
of this paper are as follows: 1) Unlike previous work, which
recognizes emotional states/sentiments from online micro-blogging
data but ignores whose they are, we seek to find out in advance who has what
opinion of a specific topic. We believe that predicting
an individual's feeling about a given target is important for affective
computing studies and can be used in various applications. 2)
To provide a solution, we consider the opinions among Twitter
social friends and the consistency of users' opinions on content-related
topics, and formulate them mathematically as social context and topical
context. 3) Utilizing the emotional knowledge learned from the
observed tweets and the social and topical context information, we
propose a Naive Bayes classification method to predict the
unknown user-topic opinions.